Scheduling to Reduce Memory Coherence Overhead on Coarse-Grain Multiprocessors
Abstract
Some Distributed Shared Memory (DSM) and Cache-Only Memory Architecture (COMA) multiprocessors keep processes near the data they reference by transparently replicating remote data in the processes' local memories. This automatic replication of data can impose substantial memory system overhead on an application since all replicated data must be kept coherent. We examine the effect of task scheduling on data replication and memory system overhead due to coherency requirements. We show that simple policies using programmer hints can reduce memory coherence overhead in our workload applications.
Similar resources
Scheduling to Reduce Memory Coherence Overhead on Coarse-grain Multiprocessors
Some Distributed Shared Memory (DSM) and Cache-Only Memory Architecture (COMA) multiprocessors keep processes near the data they reference by transparently replicating remote data in the processes' local memories. This automatic replication of data can impose substantial memory system overhead on an application since all replicated data must be kept coherent. We examine the effect of task schedu...
Mercury: Object-Affinity Scheduling and Continuation Passing on Multiprocessors
Mercury [12, 17] is a system designed to explore methods for improving the performance of "natural grain" parallel object-oriented programs on shared memory multiprocessors with hardware-coherent caches. The novel aspects of Mercury are a locality-conscious implementation of user-level threads, new scheduling techniques based on object affinity, and a lightweight task management mechanism that use...
"threads: a System for the Support of Concurrent Programming". Technical Report
Many parallel applications are implemented using lightweight thread packages. The low overhead associated with user-level thread management encourages programmers to use threads to exploit fine-grain parallelism in an application. Although the overhead of explicit thread management can be very small, there is other overhead associated with lightweight threads: the time required to load data into ...
Comparative Evaluation of Fine- and Coarse-Grain Approaches for Software Distributed Shared Memory
Symmetric multiprocessors (SMPs) connected with low-latency networks provide attractive building blocks for software distributed shared memory systems. Two distinct approaches have been used: the fine-grain approach that instruments application loads and stores to support a small coherence granularity, and the coarse-grain approach based on virtual memory hardware that provides coherence at a p...
Simulation study of memory performance of SMP multiprocessors running a TPC-W workload
The infrastructure to support electronic commerce is one of the areas where more processing power is needed. A multiprocessor system can offer advantages for running electronic commerce applications. The memory performance of an electronic commerce server, i.e. a system running electronic commerce applications, is evaluated in the case of shared-bus multiprocessor architecture. The software arc...